M8b: $formatInteger + $parseInteger (XPath integer pictures)#27

Merged
flearc merged 6 commits into main from feature/m8b-formatinteger on Jun 26, 2026

Conversation

@flearc flearc commented Jun 26, 2026

Owner

Summary

  • $formatInteger(number, picture) + $parseInteger(string, picture) — a faithful Lua port of jsonata-js v2.2.1's XPath/XSLT fn:format-integer picture machinery, in a new module src/jsonata/functions/formatinteger.lua. Both directions for all four format families:
    • decimal-digit-pattern: Unicode digit families (Arabic-Indic ١, fullwidth １, …), regular + irregular grouping, ordinal ;o suffixes (st/nd/rd/th with the 11–13th teen exception), D3131 mixed-family.
    • roman (I/i), letters (a/A, bijective base-26), words (w/W/Ww, cardinal + ordinal, magnitudes clamped to trillion).
  • Shared analyse_integer_picture + the four converter pairs; internals exported (_internal) for M8c ($formatDateTime/$parseDateTime) reuse.
  • Hoists codepoint/from_codepoint into H (shared with $formatNumber). D3130/D3131 errors.
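
For readers unfamiliar with the teen exception mentioned above, here is a minimal sketch of English ordinal suffix selection. The function name `ordinal_suffix` is hypothetical and is not taken from formatinteger.lua; it only illustrates the rule that 11, 12 and 13 take "th" even though they end in 1, 2 and 3.

```lua
-- Hypothetical helper (not the module's actual code): pick the English
-- ordinal suffix for a non-negative integer.
local function ordinal_suffix(n)
  local last_two = n % 100
  -- Teen exception: 11, 12, 13 (and 111, 212, ...) always take "th".
  if last_two >= 11 and last_two <= 13 then
    return "th"
  end
  local last = n % 10
  if last == 1 then return "st" end
  if last == 2 then return "nd" end
  if last == 3 then return "rd" end
  return "th"
end

print(1 .. ordinal_suffix(1))     -- 1st
print(12 .. ordinal_suffix(12))   -- 12th
print(23 .. ordinal_suffix(23))   -- 23rd
```

Checking the last two digits before the last digit is what keeps 112 as "112th" while 122 still becomes "122nd".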

Suite impact

  • function-formatInteger 2 → 65/65; function-parseInteger 1 → 61/61.
  • Official suite 1324 → 1447 / 1682 (≈86.0%), zero regressions (guard green).

Test Plan

  • 561/561 busted unit specs green; spec/formatinteger_spec.lua covers every category both directions + large ints, families, ordinal/teen edges, D3130/D3131, NaN-on-garbage.
  • Official-suite regression guard green.
  • Adversarial oracle review (per-category + final pass) vs jsonata-js v2.2.1 — 2 bugs found & fixed (sci-notation large ints; words-parse crash → NaN); jsonata's hardcoded-comma parse quirk confirmed faithful.

🤖 Generated with Claude Code

flearc and others added 6 commits June 26, 2026 02:15
New functions/formatinteger.lua: analyse_integer_picture + decimal-digit-pattern
format/parse (Unicode digit families, regular+irregular grouping, ordinal suffix
with the teen exception, D3131 mixed-group). Registers $formatInteger <n-s:s> /
$parseInteger <s-s:n>; roman/letters/words branches follow. Hoists
codepoint/from_codepoint into H (shared with formatnumber, and M8c next).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
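
The commit above combines regular grouping with Unicode digit families. A rough sketch of how the two can compose follows. All names here (`from_codepoint`, `group_regular`, `to_family`) are illustrative stand-ins, and the module's real helpers may differ in signature and behaviour. It assumes a digit family is identified by the code point of its zero digit, e.g. U+0660 for Arabic-Indic.

```lua
-- Illustrative stand-in for a from_codepoint helper: encode one code point
-- as UTF-8. Only code points up to U+FFFF are handled in this sketch.
local function from_codepoint(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
  else
    return string.char(0xE0 + math.floor(cp / 0x1000),
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

-- Regular grouping: insert `sep` every `size` digits, counting from the right.
local function group_regular(digits, size, sep)
  local out = {}
  local first = #digits % size
  if first > 0 then out[#out + 1] = digits:sub(1, first) end
  for i = first + 1, #digits, size do
    out[#out + 1] = digits:sub(i, i + size - 1)
  end
  return table.concat(out, sep)
end

-- Map ASCII digits onto another family, given the code point of its zero.
-- Non-digit characters (such as separators) pass through unchanged.
local function to_family(text, zero_cp)
  return (text:gsub("%d", function(d)
    return from_codepoint(zero_cp + (d:byte() - 48))
  end))
end

print(group_regular("1234567", 3, ","))                     -- 1,234,567
print(to_family(group_regular("1234567", 3, ","), 0x0660))  -- Arabic-Indic digits
```

Grouping is applied while the digits are still ASCII, so positions can be counted in bytes; the family mapping runs last.
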
format_decimal used tostring(value) which yields scientific notation for large
integer-valued floats (1000000000000000 -> "1e+15"). Use H.num_to_str, which
formats integer-valued numbers via %d. Matches the oracle.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
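
A small sketch of the failure mode fixed above. The function below is a hypothetical stand-in for H.num_to_str, not its actual source. The commit says the real helper uses %d; this sketch uses %.0f instead, which gives the same digits for integer-valued doubles and sidesteps integer-width differences between Lua versions and platforms.

```lua
-- Hypothetical stand-in for H.num_to_str (the real helper may differ).
-- tostring() switches to scientific notation for large floats, so
-- integer-valued numbers are formatted explicitly without an exponent.
local function num_to_str(n)
  -- 2^53 is the limit below which every integer is exactly representable
  -- as a double; NaN and infinities fall through to tostring().
  if n == math.floor(n) and math.abs(n) < 2^53 then
    return string.format("%.0f", n)
  end
  return tostring(n)
end

print(tostring(1e15))    -- the problematic form described in the commit
print(num_to_str(1e15))  -- 1000000000000000
```
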
decimalToRoman/romanToDecimal (subtractive) and bijective base-26
decimalToLetters/lettersToDecimal, both directions; D3130 for unsupported
sequence tokens.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
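
To make the two numbering schemes in this commit concrete, here is a minimal sketch of subtractive Roman formatting and bijective base-26 letters. The Lua names are hypothetical and the code is not copied from the module; it assumes positive integer input and lowercase letters only.

```lua
-- Greedy subtractive Roman numerals: the table lists the subtractive pairs
-- (CM, CD, XC, XL, IX, IV) alongside the plain symbols, largest first.
local ROMAN = {
  {1000, "M"}, {900, "CM"}, {500, "D"}, {400, "CD"},
  {100, "C"},  {90, "XC"},  {50, "L"},  {40, "XL"},
  {10, "X"},   {9, "IX"},   {5, "V"},   {4, "IV"}, {1, "I"},
}

local function decimal_to_roman(n)
  local out = {}
  for _, pair in ipairs(ROMAN) do
    local value, numeral = pair[1], pair[2]
    while n >= value do
      out[#out + 1] = numeral
      n = n - value
    end
  end
  return table.concat(out)
end

-- Bijective base-26: a=1 ... z=26, aa=27. There is no zero digit, hence
-- the (n - 1) shift before each division.
local function decimal_to_letters(n)
  local out = ""
  while n > 0 do
    local rem = (n - 1) % 26
    out = string.char(97 + rem) .. out
    n = math.floor((n - 1) / 26)
  end
  return out
end

local function letters_to_decimal(s)
  local n = 0
  for i = 1, #s do
    n = n * 26 + (s:byte(i) - 96)  -- 'a' is byte 97, so 'a' maps to 1
  end
  return n
end

print(decimal_to_roman(1994))    -- MCMXCIV
print(decimal_to_letters(27))    -- aa
print(letters_to_decimal("aa"))  -- 27
```
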
…ections

numberToWords recursive engine (magnitudes to trillion, clamped; and/comma
joins; ordinal mutations) + wordsToNumber (wordValues table + segment-accumulate)
for w/W/Ww. Completes $formatInteger/$parseInteger.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…1, zero regressions

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
words_to_number crashed (nil < 100) on unrecognized tokens; jsonata's
wordsToNumber maps them to undefined, whose lenient coercion yields NaN. Mirror
that (nil routes to the multiply branch, * NaN). Matches the oracle (null).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
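
The fix above can be illustrated with a simplified segment-accumulate parser. This is a sketch under assumptions, not the module's code: the word table is deliberately tiny, tokenisation is reduced to "runs of letters, skipping and", and the names are hypothetical. The point is the nil handling: an unknown word is never compared with `<` (which would raise an error in Lua) and is instead routed to the multiply branch as NaN.

```lua
local NAN = 0 / 0

-- Deliberately tiny, illustrative word table.
local WORD_VALUES = {
  one = 1, two = 2, three = 3, twenty = 20,
  hundred = 100, thousand = 1000,
}

local function words_to_number(text)
  local segs = { 0 }
  for word in text:gmatch("%a+") do
    if word ~= "and" then
      local value = WORD_VALUES[word]
      if value ~= nil and value < 100 then
        -- Small values add to the current segment, unless that segment is
        -- already a closed magnitude (>= 1000), in which case start a new one.
        local top = table.remove(segs)
        if top >= 1000 then
          segs[#segs + 1] = top
          top = 0
        end
        segs[#segs + 1] = top + value
      else
        -- Multipliers scale the current segment. Unknown words (nil) land
        -- here too and poison the result with NaN instead of crashing.
        segs[#segs + 1] = table.remove(segs) * (value or NAN)
      end
    end
  end
  local total = 0
  for _, seg in ipairs(segs) do total = total + seg end
  return total
end

print(words_to_number("one hundred and twenty-three"))  -- 123
local r = words_to_number("one banana")
print(r ~= r)                                           -- true (NaN)
```
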
@flearc flearc merged commit 29be84f into main Jun 26, 2026
1 check passed
@flearc flearc deleted the feature/m8b-formatinteger branch June 26, 2026 01:15